Closed Bug 435865 Opened 16 years ago Closed 8 years ago

Long delays against some HTTP servers when using multiple persistent connections (network.http.max-persistent-connections-per-server)

Categories

(Core :: Networking: HTTP, defect)

x86
All
defect
Not set
major

RESOLVED INCOMPLETE

People

(Reporter: robert, Unassigned)

Details

(Keywords: perf)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b5) Gecko/2008043010 Fedora/3.0-0.60.beta5.fc9 Firefox/3.0b5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b5) Gecko/2008043010 Fedora/3.0-0.60.beta5.fc9 Firefox/3.0b5

If network.http.max-persistent-connections-per-server is set to any value greater than 1 (which of course includes the default setting of 6), then against a number of HTTP servers I see long delays while Firefox seems to be waiting on the HTTP server and/or vice versa.

This appears to be a regression, as Firefox 2 does not exhibit the problem. I have tried every release from 3.0b1 through 3.0rc1, and all of them show it; 2.0.0.14 works fine.

Reproducible: Always

Steps to Reproduce:
1. rm -rf $HOME/.mozilla
2. firefox
3. select as default, remove "Latest headlines", disable phishing protection, stop firefox
4. su -c 'tcpdump -X -s 2000 "tcp port 80" > ff-3.0rc1.txt'
5. firefox http://www.facebook.com/
6. observe firefox taking forever to load facebook frontpage
7. stop tcpdump
8. stop firefox
Actual Results:  
Firefox takes forever to fully load page.

Expected Results:  
Firefox should take a second or three, like in older releases.
Both of the attached tcpdumps above were taken with network.http.max-persistent-connections-per-server=6

In other words I used the default for FF3 and changed the setting in about:config in FF2.

Notice that both dumps do show six connections being opened to 193.213.121.89. It seems this host ignores some of these connection attempts, which might be important. Could it be that some hosts throttle a client that opens too many HTTP connections too quickly by simply ignoring the excess connections, and that Firefox 3 does not handle this situation gracefully?
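
To check the "six connections to one host" observation without reading the dump by eye, something like the following rough sketch could count distinct client endpoints per server. It assumes numeric tcpdump text output (as produced with `-n`, which the capture command in the steps above did not use) and IPv4 only; the regular expression and the function name `connections_per_server` are illustrative, not part of any tool.

```python
import re
from collections import defaultdict

# Assumption: numeric tcpdump output, where a packet header line looks like
#   "HH:MM:SS.ffffff IP 10.0.0.5.51234 > 193.213.121.89.80: ..."
# Hex dump lines from -X do not match and are skipped.
HEADER = re.compile(
    r"^\S+ IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)

def connections_per_server(lines, server_port=80):
    """Count distinct client (address, port) pairs seen sending to each server."""
    clients = defaultdict(set)
    for line in lines:
        match = HEADER.match(line)
        if match and int(match.group(4)) == server_port:
            server = match.group(3)
            clients[server].add((match.group(1), int(match.group(2))))
    return {server: len(pairs) for server, pairs in clients.items()}
```

Calling `connections_per_server(open("ff-3.0rc1.txt"))` would then give a per-server count, since each connection from the browser uses its own source port.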

The ff-3.0rc1 tcpdump shows that from timestamp 15:59:37.173085 the connection is wedged: both sides retransmit, with no forward progress.
Then at timestamp 16:02:45.874159 things finally get going again.
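
For anyone repeating the capture, a small sketch like the one below can put a number on the wedge and flag long pauses in a dump. It assumes every packet header line in tcpdump's text output begins with an `HH:MM:SS.microseconds` timestamp; the helper names are made up for illustration. Because retransmits still appear during the stall, the pauses it reports only point at roughly where to look.

```python
import re
from datetime import datetime, timedelta

# Assumption: packet header lines start with a tcpdump timestamp such as
# "15:59:37.173085"; the indented hex dump lines from -X do not.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s")

def parse_time(text):
    """Parse a tcpdump time-of-day stamp (no date component)."""
    return datetime.strptime(text, "%H:%M:%S.%f")

def stall_duration(start, end):
    """Return the time elapsed between two tcpdump timestamps."""
    gap = parse_time(end) - parse_time(start)
    if gap < timedelta(0):
        gap += timedelta(days=1)  # the capture crossed midnight
    return gap

def gaps_longer_than(lines, threshold):
    """Yield (previous, current, gap) for pauses between packets above threshold."""
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        current = match.group(1)
        if previous is not None:
            gap = stall_duration(previous, current)
            if gap > threshold:
                yield previous, current, gap
        previous = current
```

For the two timestamps quoted above, `stall_duration("15:59:37.173085", "16:02:45.874159")` comes to roughly 3 minutes 9 seconds.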
This has been happening to me as well with trunk builds for at least a month, and it is extremely annoying.  When you open a number of tabs and one tab fails to connect, all the remaining tabs successively fail to connect too.  This happens to me on facebook.com, and also when opening multiple Google image search result URLs in succession.  Both sites involve accessing multiple pages on the same domain: Facebook obviously, and Google Images because it now appends every destination search URL to some kind of Google middle-man "proxy" URL.
I can confirm that changing it to 1 seems to largely fix the problem for me as well, so there is definitely something amiss in the trunk, and not just on Linux.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9pre) Gecko/2008060407 Minefield/3.0pre ID:2008060407
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Linux → All
Summary: Long delays against some HTTP servers when using multiple persistent connections → Long delays against some HTTP servers when using multiple persistent connections (network.http.max-persistent-connections-per-server)
Robert, would you be willing to try the nightly builds at http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/ (you want the -trunk ones) to try to narrow down to a 24-hour window in which the Firefox 2 behavior changed to the Firefox 3 one?

I just tried http://www.facebook.com/ over here on Linux and can't seem to reproduce the bug; otherwise I'd just do it myself...
I think this is the cause of Bug 418993 as well.  When I set it to 6, I can reproduce the bug on Google Maps by zooming in and out very quickly with the scroll wheel.  When I set it to 1, I can't get it to happen.
Boris, after a long session of binary-searching my way back in time, I find that:

2007-04-20-04-trunk is the last good version
2007-04-20-20-trunk is where the bug was introduced

Hope this helps!
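
The binary search over nightlies described above can be sketched roughly as follows. This is only an illustration: `first_bad_build` and the `is_bad` callback are hypothetical names, and in practice `is_bad` means installing a build, running the reproduction steps with a clean profile, and answering by hand.

```python
def first_bad_build(builds, is_bad):
    """Return the earliest build for which is_bad(build) is true.

    builds must be ordered oldest to newest, with builds[0] known good and
    builds[-1] known bad, and the bug must persist once it has appeared.
    """
    low, high = 0, len(builds) - 1  # invariant: builds[low] good, builds[high] bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(builds[mid]):
            high = mid  # the regression is at mid or earlier
        else:
            low = mid   # the regression is after mid
    return builds[high]
```

Each round halves the window, so a year of twice-daily nightlies needs only about ten manual test runs.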
Checkins in that range:

http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=MozillaTinderboxAll&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-04-20+02&maxdate=2007-04-20+22&cvsroot=%2Fcvsroot

Nothing in that range is changing network code... :(

The only thing in that range that could obviously have affected what loads we do is bug 84582, and that seems pretty unlikely to lead to request patterns that can't be produced just by having a bunch of <img>s on a page.
Yeah, I checked with Bonsai and agree that the only likely suspect is the fix for bug 84582. Looking at that fix, isn't it about the loading of CSS and dependencies on that? Both Facebook and Google Maps have a mix of CSS, images, JavaScript and HTML, so it doesn't sound entirely unlikely that this is related.

I can still reproduce this with 100% certainty. What's weird is that the error reports aren't pouring in for such a fundamental and enduser visible problem. Only this bug and bug 435865.
What that change did was to switch from treating CSS loads like <script> loads to treating them more like <img> loads: not blocking the parser while waiting for the load.  In some cases this does allow us to make more parallel connections (the whole point of that patch, in fact).  But there were already cases where we could hit that many parallel connections...
I'm not seeing this anymore in the latest nightly.  Anyone else?

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1a2pre) Gecko/2008080703 Minefield/3.1a2pre ID:2008080703
Sorry, I'm still seeing this bug exactly as before.

I just tried with the 2008081102 version.

Followed the steps 1 to 8 as when I first entered this bug, still hangs for me.
(In reply to comment #13)

I don't know if anything related to this bug was recently fixed then, but Google Maps and Facebook aren't doing the hanging now that I attributed to this bug and Bug 418993 from before.  I'll test your original steps and get back here with my results.
Thanks for testing.

The important thing in order to trigger this is to start with a clean profile, or at least with a clean disk-cache.

If you're running the test with the page in question cached, the delay bug normally will not trigger.
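
One way to guarantee a clean profile and an empty cache on every run, without deleting $HOME/.mozilla, is to launch Firefox against a throwaway profile directory. The sketch below is only a suggestion: it assumes a `firefox` binary on the PATH and relies on a `user.js` file in the profile to set the pref; the function names and the directory prefix are made up.

```python
import tempfile
from pathlib import Path

PREF = "network.http.max-persistent-connections-per-server"

def make_clean_profile(max_connections):
    """Create an empty profile directory whose user.js sets the pref."""
    profile = Path(tempfile.mkdtemp(prefix="ff-bug435865-"))
    (profile / "user.js").write_text(
        f'user_pref("{PREF}", {int(max_connections)});\n'
    )
    return profile

def firefox_command(profile, url):
    """Build the command line; -no-remote avoids reusing a running instance."""
    return ["firefox", "-no-remote", "-profile", str(profile), url]
```

Passing `firefox_command(make_clean_profile(6), "http://www.facebook.com/")` to `subprocess.run` starts a run with nothing cached; repeating it with `make_clean_profile(1)` gives the comparison case.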
Keywords: perf
Facebook is running h2 (HTTP/2) these days. To be actionable at this point, we need fresh data.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE